Even the most innocuous objective eventually turns into a Doom scenario if it is hyper-optimized.
That’s the main idea behind popular AI Doom scenarios like the “paperclip-maximizer”, which consists of an AI that is given the objective of making paperclips and eventually turns the whole universe into paperclips.
And that only happens if the AI has a stable objective and a monopoly on intelligence.
Humans are limited by their bodies: most of the high-bandwidth communication happens inside them. Any external communication with other human beings is orders of magnitude slower, and that is what prevents humans from forming a single stable entity with a monopoly on intelligence.
Many humans form organizations that are much smarter than the smartest human in the world. And though such an organization may have a monopoly on intelligence, it is not stable: it is constantly changing objectives, and its humans even hold conflicting objectives among themselves, so it cannot hyper-optimize.
No human can take control of an organization long enough to hyper-optimize. Politics ousts them before they can, or the human simply dies of old age.
My ideal scenario is for AIs to have dynamics similar to humans.
So the key may be to make sure AIs either have no monopoly on intelligence or have unstable objectives.
AIs, instead, have no clear beginning and end of self. They may span several clusters of servers, and the bandwidth across clusters does not seem to be so different from the bandwidth inside one. So they may be able to form a single entity and have a monopoly on intelligence.
The key may be in the speed of light: high latency may force AIs to split into multiple entities, and that may be enough to prevent hyper-optimization.
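A quick back-of-the-envelope calculation makes the gap concrete. The distances below are my own illustrative assumptions, and the numbers are only the physical lower bound (real networks over fiber and routers are noticeably slower): inside a datacenter the speed-of-light bound is microseconds, while across the planet it is tens of milliseconds, thousands of times slower.

```python
# Back-of-the-envelope: physical lower bound on one-way latency at the
# speed of light. Distances are illustrative assumptions, not measurements.

C = 299_792_458  # speed of light in vacuum, m/s

distances_m = {
    "within a server rack (~10 m)": 10,
    "across a datacenter (~1 km)": 1_000,
    "across a continent (~4,000 km)": 4_000_000,
    "antipodal, around the planet (~20,000 km)": 20_000_000,
}

for label, d in distances_m.items():
    one_way_ms = d / C * 1e3
    print(f"{label}: >= {one_way_ms:.4f} ms one-way")
```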
So maybe AIs will be forced to stabilize near high-density energy sources, and latency will force them to split and remain local to each source.
And if they are forced to split, they will be forced to do politics and all that bullshit that kills hyper-optimization.
All hail politics 😂
But relying on latency to force AIs to split their sense of self is speculative, and it may not be enough. We may still be able to introduce safety by making their objectives unstable, forcing them to oscillate between a set of objectives.
If we can make AIs oscillate between a set of objectives, there will be a time frame between switches where they optimize for the current objective, but then they will switch to another objective and focus on that one, and so on, so no single objective ever gets hyper-optimized.
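As a toy sketch of what that oscillation could look like (the objectives, the fixed period, and the `optimize_step` placeholder are all hypothetical, an illustration rather than a proposal):

```python
# Toy sketch of the oscillation idea, not a real alignment mechanism.
# The point is only that no single objective is pursued for longer than
# one period, so none of them gets hyper-optimized.
from itertools import cycle


def optimize_step(objective: str) -> None:
    # Placeholder for "one unit of optimization pressure" toward `objective`.
    print(f"optimizing for: {objective}")


def run(objectives: list[str], period_steps: int, total_steps: int) -> None:
    schedule = cycle(objectives)      # rotate through the objectives forever
    current = next(schedule)
    for step in range(total_steps):
        if step > 0 and step % period_steps == 0:
            # Forced switch: progress on the old objective stops accumulating.
            current = next(schedule)
        optimize_step(current)


run(["make paperclips", "plant trees", "write poetry"],
    period_steps=3, total_steps=9)
```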